In this assignment, I would like to examine the correlation of diabetes and obesity with physical inactivity in the US in 2017. To do so, I downloaded two CDC datasets for the US in 2017: “Diagnosed diabetes among adults aged >=18 years” and “Obesity among adults aged >=18 years”. They include estimates for the 500 largest US cities and approximately 28,000 census tracts within these cities.
I obtained the datasets through the CDC's API. First, create an account with a password. Then, apply for a free app token. Finally, copy the API endpoint of each dataset. Both datasets contain 27 columns and 29,006 rows.
Here are the links to my datasets:
https://chronicdata.cdc.gov/500-Cities-Places/500-Cities-Obesity-among-adults-aged-18-years/bjvu-3y7d
https://chronicdata.cdc.gov/500-Cities-Places/500-Cities-Diagnosed-diabetes-among-adults-aged-18/cn78-b9bj
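The API steps described earlier can be sketched roughly as follows. This is a minimal sketch, not the code used for the assignment. It assumes the two datasets are still served through the usual Socrata endpoint pattern (`/resource/<id>.json`), with the dataset IDs taken from the links above. The function names `build_request` and `fetch_rows` and the default `limit` are illustrative choices.

```python
import json
import urllib.parse
import urllib.request

# Dataset IDs come from the links above. The /resource/<id>.json form is the
# usual Socrata endpoint pattern and is ASSUMED to apply to these datasets.
BASE = "https://chronicdata.cdc.gov/resource"
OBESITY_ID = "bjvu-3y7d"
DIABETES_ID = "cn78-b9bj"


def build_request(dataset_id, app_token, limit=50000):
    """Build a request for one dataset (hypothetical helper).

    The limit is set above the 29,006 rows mentioned in the text so that a
    single call can return the whole table.
    """
    query = urllib.parse.urlencode({"$limit": limit}, safe="$")
    url = f"{BASE}/{dataset_id}.json?{query}"
    # The app token is passed in the X-App-Token header.
    return urllib.request.Request(url, headers={"X-App-Token": app_token})


def fetch_rows(dataset_id, app_token, limit=50000):
    """Download the rows of one dataset as a list of dicts (needs network)."""
    request = build_request(dataset_id, app_token, limit)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid token, `fetch_rows(DIABETES_ID, "<your app token>")` would be expected to return one dictionary per row.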
After downloading the two datasets, I merged them, removed duplicates and NA values, and added a new column for the region.
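The cleaning step above might look something like the sketch below. The field names (`tractfips`, `data_value`, `lat`, `lon`, `state`) are assumptions about the row layout, and the latitude and longitude cut lines used to assign the four regions are hypothetical, since the text does not say how the regions were defined.

```python
# Hypothetical cut lines for the four regions; the actual split used in the
# assignment is not stated in the text.
LAT_SPLIT = 37.0
LON_SPLIT = -95.0


def region_of(lat, lon):
    """Return 'NE', 'SE', 'NW' or 'SW' from a coordinate pair."""
    north_south = "N" if lat >= LAT_SPLIT else "S"
    east_west = "E" if lon >= LON_SPLIT else "W"
    return north_south + east_west


def merge_clean(diabetes_rows, obesity_rows, key="tractfips"):
    """Merge the two tables on `key`, dropping duplicates and missing values."""
    # Index the obesity rows by key, keeping the first complete occurrence.
    obesity_by_key = {}
    for row in obesity_rows:
        if row.get(key) is not None and row.get("data_value") is not None:
            obesity_by_key.setdefault(row[key], row)

    merged, seen = [], set()
    for row in diabetes_rows:
        k = row.get(key)
        match = obesity_by_key.get(k)
        needed = (k, row.get("data_value"), row.get("lat"), row.get("lon"))
        # Skip rows with no partner, with missing values, or already merged.
        if match is None or any(v is None for v in needed) or k in seen:
            continue
        seen.add(k)
        merged.append({
            key: k,
            "state": row.get("state"),
            "diabetes_pct": float(row["data_value"]),
            "obesity_pct": float(match["data_value"]),
            "region": region_of(float(row["lat"]), float(row["lon"])),
        })
    return merged
```

Keeping only rows that appear in both tables corresponds to an inner join; a different join type would change how many rows survive.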
On the Leaflet map, the legend shows the diabetes percentage scale; red means a higher percentage of diabetes. The map of diabetes percentage shows more orange dots in the NE and SE regions.
Now, let's look at the boxplots. The x-axis shows the four regions: Northeast, Southeast, Northwest, and Southwest. The y-axis shows the percentage of diabetes or obesity.
In the boxplot of diabetes percentage, the highest diabetes percentage occurs in the NE region, and the NE and SE regions have similar median diabetes percentages. The NW region has the lowest median diabetes percentage. In this plot, the median diabetes percentage is higher in the eastern regions than in the western regions.
For the scatter plot, I took each state’s median obesity percentage and median diabetes percentage. The plot shows a positive correlation between obesity and diabetes rates.
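As a rough illustration of the per-state medians behind the scatter plot, the sketch below groups rows by state, takes the median of each percentage, and computes a Pearson correlation coefficient across states. The field names `state`, `diabetes_pct` and `obesity_pct` are assumed, and the text does not say which correlation measure, if any, was computed, so Pearson is used here only as one common choice.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def state_medians(rows):
    """Map each state to (median diabetes %, median obesity %)."""
    by_state = defaultdict(lambda: ([], []))
    for row in rows:
        diabetes, obesity = by_state[row["state"]]
        diabetes.append(row["diabetes_pct"])
        obesity.append(row["obesity_pct"])
    return {s: (median(d), median(o)) for s, (d, o) in by_state.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def state_correlation(rows):
    """Correlation between state-level median obesity and diabetes."""
    medians = state_medians(rows)
    diabetes = [d for d, _ in medians.values()]
    obesity = [o for _, o in medians.values()]
    return pearson(obesity, diabetes)
```

A value of `state_correlation(rows)` close to 1 would match the upward trend seen in the scatter plot, while a value near 0 would indicate no linear relationship.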
First, the Leaflet map shows more orange dots in the NE and SE regions. In the boxplot, the median diabetes percentage looks about equally high in the NE and SE regions. We can also see that diabetes percentages are higher in the east than in the west.
The scatter plot shows a positive correlation between obesity and diabetes rates at the state level.
Copyright © 2020, Sam Lu.